Winkler County
- Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
- North America > United States > Texas > Winkler County (0.04)
- North America > Canada > Quebec > Montreal (0.04)
No Bells, Just Whistles: Sports Field Registration by Leveraging Geometric Properties
Gutiérrez-Pérez, Marc, Agudo, Antonio
Broadcast sports field registration is traditionally addressed as a homography estimation task, mapping the visible image area to a planar field model, predominantly focusing on the main camera shot. Addressing the shortcomings of previous approaches, we propose a novel calibration pipeline enabling camera calibration using a 3D soccer field model and extending the process to assess the multiple-view nature of broadcast videos. Our approach begins with a keypoint generation pipeline derived from SoccerNet dataset annotations, leveraging the geometric properties of the court. Subsequently, we execute classical camera calibration through the DLT algorithm in a minimalist fashion, without further refinement. Through extensive experimentation on real-world soccer broadcast datasets such as SoccerNet-Calibration, WorldCup 2014 and TS-WorldCup, our method demonstrates superior performance in both multiple- and single-view 3D camera calibration while maintaining competitive results in homography estimation compared to state-of-the-art techniques.
- North America > United States > Texas > Winkler County (0.04)
- North America > Canada > Saskatchewan (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Spain (0.04)
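As a rough illustration of the DLT step mentioned in the abstract above, the following sketch estimates a planar homography from four field-to-image correspondences. It is not the authors' pipeline: the paper calibrates a full 3D camera, whereas this sketch uses the simpler planar case and the inhomogeneous DLT variant (fixing the last homography entry to 1, which assumes the field origin does not map to infinity). The function names `solve_linear`, `dlt_homography`, and `project`, and the sample coordinates, are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def dlt_homography(src, dst):
    """Estimate H (3x3, H[2][2] = 1) from exactly four (x, y) -> (u, v) pairs."""
    if len(src) != 4 or len(dst) != 4:
        raise ValueError("this sketch expects exactly four correspondences")
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear equations in h1..h8.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = solve_linear(rows, rhs)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def project(H, pt):
    """Apply homography H to a 2D point, dividing by the projective scale."""
    x, y = pt
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return (u, v)


# Hypothetical data: field corners in metres and their pixel positions.
field_corners = [(0, 0), (105, 0), (105, 68), (0, 68)]
image_corners = [(100, 400), (1180, 420), (900, 150), (300, 140)]
H = dlt_homography(field_corners, image_corners)
```

With more than four keypoints, a least-squares or SVD-based solve would normally replace the exact 8x8 solve used here.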
ScreenAgent: A Vision Language Model-driven Computer Control Agent
Niu, Runliang, Li, Jindong, Wang, Shiqi, Fu, Yali, Hu, Xiyu, Leng, Xueyuan, Kong, He, Chang, Yi, Wang, Qi
Existing Large Language Models (LLMs) can invoke a variety of tools and APIs to complete complex tasks. The computer, as the most powerful and universal tool, could potentially be controlled directly by a trained LLM agent. Powered by the computer, we can hopefully build a more generalized agent to assist humans in various daily digital tasks. In this paper, we construct an environment for a Vision Language Model (VLM) agent to interact with a real computer screen. Within this environment, the agent can observe screenshots and manipulate the Graphical User Interface (GUI) by outputting mouse and keyboard actions. We also design an automated control pipeline that includes planning, acting, and reflecting phases, guiding the agent to continuously interact with the environment and complete multi-step tasks. Additionally, we construct the ScreenAgent Dataset, which collects screenshots and action sequences when completing a variety of daily computer tasks. Finally, we trained a model, ScreenAgent, which achieved computer control capabilities comparable to GPT-4V and demonstrated more precise UI positioning capabilities. Our attempts could inspire further research on building a generalist LLM agent. The code is available at https://github.com/niuzaisheng/ScreenAgent.
- North America > United States > Texas > Winkler County (0.04)
- North America > United States > Texas > Loving County (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Workflow (0.89)
- Research Report (0.64)
Vision-aided UAV navigation and dynamic obstacle avoidance using gradient-based B-spline trajectory optimization
Xu, Zhefan, Xiu, Yumeng, Zhan, Xiaoyang, Chen, Baihan, Shimada, Kenji
Navigating dynamic environments requires the robot to generate collision-free trajectories and actively avoid moving obstacles. Most previous works designed path planning algorithms based on one single map representation, such as the geometric, occupancy, or ESDF map. Although they have shown success in static environments, due to the limitation of map representation, those methods cannot reliably handle static and dynamic obstacles simultaneously. To address the problem, this paper proposes a gradient-based B-spline trajectory optimization algorithm utilizing the robot's onboard vision. The depth vision enables the robot to track and represent dynamic objects geometrically based on the voxel map. The proposed optimization first adopts the circle-based guide-point algorithm to approximate the costs and gradients for avoiding static obstacles. Then, with the vision-detected moving objects, our receding-horizon distance field is simultaneously used to prevent dynamic collisions. Finally, the iterative re-guide strategy is applied to generate the collision-free trajectory. The simulation and physical experiments prove that our method can run in real-time to navigate dynamic environments safely.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- North America > United States > Texas > Winkler County (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.50)
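To make the idea of gradient-based B-spline optimization in the abstract above more concrete, here is a toy sketch. It evaluates a uniform cubic B-spline and runs plain gradient descent on a quadratic clearance penalty applied to the interior control points around one circular static obstacle. It does not implement the paper's circle-based guide points, receding-horizon distance field, or re-guide strategy; the penalty, the step size, and the names `cubic_bspline_point` and `push_control_points` are assumptions made for illustration.

```python
import math


def cubic_bspline_point(ctrl, seg, t):
    """Evaluate segment `seg` of a uniform cubic B-spline at local t in [0, 1]."""
    p0, p1, p2, p3 = ctrl[seg:seg + 4]
    # Uniform cubic B-spline basis functions; they sum to 1 for every t.
    b0 = (1 - t) ** 3 / 6
    b1 = (3 * t ** 3 - 6 * t ** 2 + 4) / 6
    b2 = (-3 * t ** 3 + 3 * t ** 2 + 3 * t + 1) / 6
    b3 = t ** 3 / 6
    return tuple(b0 * a + b1 * b + b2 * c + b3 * d
                 for a, b, c, d in zip(p0, p1, p2, p3))


def push_control_points(ctrl, center, r_safe, lr=0.25, iters=60):
    """Gradient descent on cost = (r_safe - d)^2 for interior points with d < r_safe."""
    pts = [list(p) for p in ctrl]
    for _ in range(iters):
        for p in pts[1:-1]:  # start and goal control points stay fixed
            dx, dy = p[0] - center[0], p[1] - center[1]
            d = math.hypot(dx, dy)
            if d >= r_safe or d == 0.0:
                # Already clear, or exactly at the centre where the
                # gradient direction is undefined (left untouched here).
                continue
            # d(cost)/dp = -2 (r_safe - d) * (p - center) / d
            g = -2.0 * (r_safe - d) / d
            p[0] -= lr * g * dx
            p[1] -= lr * g * dy
    return [tuple(p) for p in pts]


# Hypothetical path whose middle control point lies inside the obstacle.
control = [(0.0, 0.0), (1.0, 0.1), (2.0, -0.1), (3.0, 0.05), (4.0, 0.0)]
safe_control = push_control_points(control, center=(2.0, 0.0), r_safe=0.8)
```

Because a B-spline segment lies inside the convex hull of its four control points, moving control points away from an obstacle tends to move the curve away as well, although clearance of the control points alone does not guarantee clearance of the whole curve.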
Realtime Safety Control for Bipedal Robots to Avoid Multiple Obstacles via CLF-CBF Constraints
Liu, Jinze, Li, Minzhe, Huang, Jiunn-Kai, Grizzle, Jessy W.
To explore safely in such environments, it is critical for robots to generate quick, yet smooth responses to any changes in the obstacles, map, and environment. In this paper, we propose a means to design and compose control barrier functions (CBFs) for multiple non-overlapping obstacles and evaluate the system on a 20-degree-of-freedom (DoF) bipedal robot. In an autonomous system, the task of avoiding obstacles is usually handled by a planning algorithm because it has access to the map of an entire environment. Given the map, the planning algorithm is then able to design a collision-free path from the robot's current position to a goal. If the map is updated due to a change in the environment, the planner then needs to update the planned path, a process called replanning, to accommodate the new environment. Such maps are typically large and contain rich information such as semantics, terrain characteristics, and uncertainty, and thus are slow to update. This raises a concern when either obstacles move into the planned path before the map has been updated or the robot's new pose reveals previously unseen obstacles. The slow update rate of the map leads to either collisions or abrupt maneuvers to avoid them. The non-smooth aspects arising from the map updates or changes in the perceived environment can be detrimental to the stability of the overall system.
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
- North America > United States > Texas > Winkler County (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
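The role of a control barrier function as a fast safety layer, as described in the abstract above, can be sketched for a much simpler system than the paper's bipedal robot. The example below assumes a 2D single-integrator model (velocity is the control input), one circular obstacle, and the barrier h(x) = ||x - c||^2 - r^2. With a single linear constraint, the usual quadratic program has a closed-form solution, so no solver is needed. The paper's CLF term and its composition of CBFs for multiple obstacles are not reproduced; the function name `cbf_filter` and the gain `alpha` are hypothetical.

```python
def cbf_filter(x, u_nom, center, radius, alpha=1.0):
    """Minimally modify u_nom so that dh/dt + alpha * h >= 0 holds.

    Model assumption: x_dot = u (single integrator) in 2D.
    Barrier: h(x) = ||x - center||^2 - radius^2, safe when h >= 0.
    """
    dx = (x[0] - center[0], x[1] - center[1])
    h = dx[0] ** 2 + dx[1] ** 2 - radius ** 2
    # Constraint a . u >= b, with a = grad h = 2 (x - center), b = -alpha * h.
    a = (2.0 * dx[0], 2.0 * dx[1])
    b = -alpha * h
    slack = a[0] * u_nom[0] + a[1] * u_nom[1] - b
    if slack >= 0.0:
        return (u_nom[0], u_nom[1])  # nominal command is already safe
    aa = a[0] ** 2 + a[1] ** 2
    if aa == 0.0:
        raise ValueError("state coincides with obstacle centre")
    # Project u_nom onto the boundary of the half-space a . u >= b.
    lam = -slack / aa
    return (u_nom[0] + lam * a[0], u_nom[1] + lam * a[1])


# Hypothetical scenario: robot at the origin heading straight at an obstacle.
u_safe = cbf_filter(x=(0.0, 0.0), u_nom=(1.0, 0.0),
                    center=(2.0, 0.0), radius=1.0)
```

The filter only scales back the velocity component that points toward the obstacle, which is why such a layer can react between map updates without replanning the whole path.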
Cluster Variational Approximations for Structure Learning of Continuous-Time Bayesian Networks from Incomplete Data
Linzner, Dominik, Koeppl, Heinz
Continuous-time Bayesian networks (CTBNs) constitute a general and powerful framework for modeling continuous-time stochastic processes on networks. This makes them particularly attractive for learning the directed structures among interacting entities. However, if the available data is incomplete, one needs to simulate the prohibitively complex CTBN dynamics. Existing approximation techniques, such as sampling and low-order variational methods, either scale unfavorably in system size, or are unsatisfactory in terms of accuracy. Inspired by recent advances in statistical physics, we present a new approximation scheme based on cluster variational methods that significantly improves upon existing variational approximations. We can analytically marginalize the parameters of the approximate CTBN, as these are of secondary importance for structure learning. This recovers a scalable scheme for direct structure learning from incomplete and noisy time-series data. Our approach outperforms existing methods in terms of scalability.
- Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
- North America > United States > Texas > Winkler County (0.04)
- North America > Canada > Quebec > Montreal (0.04)
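For readers unfamiliar with the model class in the abstract above, the following sketch simulates a small CTBN forward in time. It covers only the generative side: each binary node flips at an exponential rate that depends on the current state of its parents, sampled with a Gillespie-style procedure. The cluster variational approximation and the structure learning itself are not shown. The two-node network X -> Y, its rates, and the function name `sample_ctbn` are hypothetical.

```python
import random


def sample_ctbn(rates, parents, init, t_end, rng):
    """Sample one trajectory of a CTBN with binary nodes up to time t_end.

    rates[node][parent_states][state] is the rate of leaving `state`
    given the tuple of current parent states.
    Returns a list of (time, state_dict) pairs, starting at time 0.
    """
    state = dict(init)
    t = 0.0
    trajectory = [(t, dict(state))]
    while True:
        # Exit rate of every node, conditioned on its parents' current states.
        node_rates = {
            n: rates[n][tuple(state[p] for p in parents[n])][state[n]]
            for n in state
        }
        total = sum(node_rates.values())
        if total <= 0.0:
            break  # absorbing configuration, nothing can change
        t += rng.expovariate(total)  # waiting time until the next transition
        if t >= t_end:
            break
        # Pick the node that transitions, with probability rate / total.
        u = rng.random() * total
        for n, r in node_rates.items():
            u -= r
            if u <= 0.0:
                break
        state[n] = 1 - state[n]  # binary node: flip its state
        trajectory.append((t, dict(state)))
    return trajectory


# Hypothetical network: X has no parents, Y depends on X.
parents = {"X": (), "Y": ("X",)}
rates = {
    "X": {(): [1.0, 2.0]},
    "Y": {(0,): [0.5, 0.5], (1,): [5.0, 0.1]},
}
path = sample_ctbn(rates, parents, {"X": 0, "Y": 0}, t_end=10.0,
                   rng=random.Random(0))
```

Incomplete data in this setting corresponds to observing such a trajectory only at a few time points or with noise, which is what makes exact inference over the hidden path expensive.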
Applying Perceptually Driven Cognitive Mapping to Virtual Urban Environments
Hill, Randall W., Jr., Han, Changhee, van Lent, Michael
This article describes a method for building a cognitive map of a virtual urban environment. Our routines enable virtual humans to map their environment using a realistic model of perception. We based our implementation on a computational framework proposed by Yeap and Jefferies (1999) for representing a local environment as a structure called an absolute space representation (ASR). Their algorithms compute and update ASRs from a 2-1/2-dimensional (2-1/2D) sketch of the local environment and then connect the ASRs together to form a raw cognitive map. Our work extends the framework developed by Yeap and Jefferies in three important ways. First, we implemented the framework in a virtual training environment, the mission rehearsal exercise (Swartout et al. 2001). Second, we developed a method for acquiring a 2-1/2D sketch in a virtual world, a step omitted from their framework but that is essential for computing an ASR. Third, we extended the ASR algorithm to map regions that are partially visible through exits of the local space. Together, the implementation of the ASR algorithm, along with our extensions, will be useful in a wide variety of applications involving virtual humans and agents who need to perceive and reason about spatial concepts in urban environments.
- North America > United States > California > San Mateo County > Menlo Park (0.14)
- North America > United States > Florida > Orange County > Orlando (0.04)
- North America > United States > California > Santa Clara County > Stanford (0.04)
- Education > Educational Setting (0.68)
- Government > Military > Army (0.47)
- Leisure & Entertainment > Games > Computer Games (0.46)